    Introducing a framework to assess newly created questions with Natural Language Processing

    Statistical models such as those derived from Item Response Theory (IRT) enable the assessment of students on a specific subject, which can be useful for several purposes (e.g., learning path customization, drop-out prediction). However, the questions have to be assessed as well and, although IRT can estimate the characteristics of questions that have already been answered by several students, this technique cannot be used on newly generated questions. In this paper, we propose a framework to train and evaluate models for estimating the difficulty and discrimination of newly created Multiple Choice Questions by extracting meaningful features from the text of the question and of the possible choices. We implement one model using this framework and test it on a real-world dataset provided by CloudAcademy, showing that it outperforms previously proposed models, reducing the RMSE by 6.7% for difficulty estimation and by 10.8% for discrimination estimation. We also present the results of an ablation study performed to support our choice of features and to show the effects of different characteristics of the questions' text on difficulty and discrimination. Comment: Accepted at the International Conference on Artificial Intelligence in Education.
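
    A minimal sketch of the kind of pipeline this abstract describes: predicting an IRT difficulty parameter for new questions from text features alone. This is not the authors' implementation; the TF-IDF features, the random-forest regressor, and the toy data are illustrative assumptions (requires scikit-learn and numpy).

    ```python
    # Sketch: estimate IRT difficulty for newly created MCQs from text
    # features, in the spirit of the framework above. Features, model,
    # and data are illustrative stand-ins, not the paper's method.
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.metrics import mean_squared_error

    # Hypothetical calibration set: question stem joined with its choices,
    # and IRT difficulty (b) previously estimated from student responses.
    texts = [
        "Which AWS service stores objects? | S3 | EC2 | Lambda | RDS",
        "Which protocol is connectionless? | UDP | TCP | FTP | SMTP",
        "What does IAM manage? | Identities and access | Storage | DNS | Billing",
        "Which database is relational? | RDS | DynamoDB | S3 | SQS",
    ]
    b_params = np.array([-0.9, 0.3, -0.2, 0.7])

    vectorizer = TfidfVectorizer(ngram_range=(1, 2))
    X = vectorizer.fit_transform(texts)

    model = RandomForestRegressor(n_estimators=200, random_state=0)
    model.fit(X, b_params)

    # A brand-new question gets a difficulty estimate with no response data.
    new_item = ["Which service runs containers? | ECS | S3 | Route 53 | SES"]
    print(model.predict(vectorizer.transform(new_item)))

    # Held-out evaluation would use RMSE, the metric reported in the paper.
    preds = model.predict(X)
    print(np.sqrt(mean_squared_error(b_params, preds)))
    ```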

    A meeting report: OECD-GESIS Seminar on Translating and Adapting Instruments in Large-Scale Assessments (2018)

    This report summarizes the main themes and conclusions from the OECD-GESIS Seminar on Translating and Adapting Instruments in Large-Scale Assessments, which took place at the Organisation for Economic Co-operation and Development (OECD) in Paris in June 2018. The five sessions covered the topics of (1) etic (universal) vs. emic (culture-specific) measurement instruments, (2) language- and culture-sensitive development of measurement instruments, (3) international guidelines vs. implementation in countries and by translators, (4) tools and technological developments, and (5) quality control of translations. Key players in the field presented best practices, lessons learned, and innovations, and made suggestions for moving the field forward.

    Using item response theory to explore the psychometric properties of extended matching questions examination in undergraduate medical education

    BACKGROUND: As assessment has been shown to direct learning, it is critical that the examinations developed to test clinical competence in medical undergraduates are valid and reliable. The use of extended matching questions (EMQ) has been advocated to overcome some of the criticisms of using multiple-choice questions to test factual and applied knowledge. METHODS: We analysed the results from the Extended Matching Questions Examination taken by 4th-year undergraduate medical students in the academic year 2001 to 2002. Rasch analysis was used to examine whether the set of questions used in the examination mapped onto a unidimensional scale, the degree of difficulty of questions within and between the various medical and surgical specialties, and the pattern of responses within individual questions to assess the impact of the distractor options. RESULTS: Analysis of a subset of items and of the full examination demonstrated internal construct validity and the absence of bias on the majority of questions. Three main patterns of response selection were identified. CONCLUSION: Modern psychometric methods based upon the work of Rasch provide a useful approach to the calibration and analysis of EMQ undergraduate medical assessments. The approach allows for a formal test of the unidimensionality of the questions and thus the validity of the summed score. Given the metric calibration which follows fit to the model, it also allows for the establishment of item banks to facilitate continuity and equity in exam standards.
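
    For reference, the dichotomous Rasch model underlying this analysis gives the probability of a correct response as a logistic function of the gap between person ability and item difficulty. A minimal illustration in numpy; not the study's analysis code, which would typically use dedicated Rasch software.

    ```python
    # Dichotomous Rasch model: P(correct) depends only on the difference
    # between person ability (theta) and item difficulty (b), both placed
    # on one logit scale. Minimal illustration only.
    import numpy as np

    def rasch_probability(theta: float, b: float) -> float:
        """P(X = 1 | theta, b) = exp(theta - b) / (1 + exp(theta - b))."""
        return 1.0 / (1.0 + np.exp(-(theta - b)))

    # When ability equals difficulty, the success probability is 0.5; this
    # shared metric is what makes calibrated item banks and the summed-score
    # validity check described in the abstract possible.
    print(rasch_probability(theta=0.0, b=0.0))   # 0.5
    print(rasch_probability(theta=1.5, b=-0.5))  # ~0.88: easy item, able examinee
    ```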

    A collaborative comparison of Objective Structured Clinical Examination (OSCE) standard setting methods at Australian medical schools

    Background: A key issue underpinning the usefulness of the OSCE assessment in medical education is standard setting, but the majority of standard-setting methods remain challenging for performance assessment because they produce varying passing marks. Several studies have compared standard-setting methods; however, most are limited in experimental scope, or use data on examinee performance at a single OSCE station or from a single medical school. This collaborative study between ten Australian medical schools investigated the effect of standard-setting methods on OSCE cut scores and failure rates. Methods: This research used 5,256 examinee scores from seven shared OSCE stations to calculate cut scores and failure rates using two different compromise standard-setting methods, namely the Borderline Regression and Cohen's methods. Results: The results indicate that Cohen's method yields outcomes similar to the Borderline Regression method, particularly for large examinee cohorts. However, with lower examinee numbers on a station, the Borderline Regression method resulted in higher cut scores and larger differences in failure rates. Conclusion: Cohen's method yields outcomes similar to the Borderline Regression method, and its application for benchmarking purposes and in resource-limited settings is justifiable, particularly with large examinee numbers.
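
    A sketch of how the two compromise methods derive a cut score, under common formulations: Borderline Regression regresses station scores on examiners' global ratings and reads off the predicted score at the borderline category, while Cohen's method takes a fixed fraction of a high-percentile score. The rating scale, the 60% fraction, the 95th percentile, and the simulated data are assumptions for demonstration, not the study's exact procedure.

    ```python
    # Illustrative cut-score computation for the two compromise methods
    # compared in the study. Parameter choices and data are assumptions.
    import numpy as np

    def borderline_regression_cut(scores, global_ratings, borderline=2):
        """Regress station scores on examiners' global ratings and take
        the predicted score at the 'borderline' rating as the cut."""
        slope, intercept = np.polyfit(global_ratings, scores, deg=1)
        return intercept + slope * borderline

    def cohen_cut(scores, fraction=0.60, percentile=95):
        """Cohen's method: a fixed fraction of the score achieved at a
        high percentile of the cohort (commonly 60% of the 95th)."""
        return fraction * np.percentile(scores, percentile)

    rng = np.random.default_rng(0)
    station_scores = rng.normal(68, 11, size=750).clip(0, 100)
    # Global ratings 1 (clear fail) .. 5 (excellent), loosely tied to score.
    ratings = np.clip(np.round((station_scores - 35) / 14), 1, 5)

    print(borderline_regression_cut(station_scores, ratings))
    print(cohen_cut(station_scores))
    ```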

    Testing and Assessment in an International Context: Cross- and Multi-cultural Issues

    Globalisation, increasing migration flows, and concurrent worldwide competitiveness demand a rethinking of testing and assessment procedures and practices in an international and multicultural context. This chapter reviews the methodological and practical implications for psychological assessment in the field of career guidance. The methodological implications are numerous, and several aspects have to be considered, such as cross-cultural equivalence, or construct, method, and item bias. Moreover, the construct of culture is itself difficult to define and to measure. In order to provide non-discriminatory assessment, counsellors should develop their clinical cross-cultural competencies, develop more specific intervention strategies, and respect cultural differences. Several suggestions are given concerning the translation and adaptation of psychological instruments, the development of culture-specific measures, and the use of these instruments. Further research in this field should use mixed methods and multi-centric designs, and consider emic and etic psychological variables. A multidisciplinary approach might also help identify culture-specific and ecologically meaningful constructs. Non-discriminatory assessment implies considering the influence and interaction of personal characteristics and environmental factors.

    Design and Key Features of the PIAAC Survey of Adults

    This chapter gives an overview of the most important features of the Programme for the International Assessment of Adult Competencies (PIAAC) survey as they pertain to two main goals. First, only a well-designed survey will lead to accurate and comparable test scores across different countries and languages, both within and across assessment cycles. Second, only an understanding of its complex survey design will lead to proper use of the PIAAC data in secondary analyses and meaningful interpretation of results by psychometricians, data analysts, scientists, and policymakers. The chapter begins with a brief introduction to the PIAAC survey, followed by an overview of the background questionnaire and the cognitive measures. The cognitive measures are then compared to what was assessed in previous international adult surveys. Key features of the assessment design are discussed, followed by a section describing what could be done to improve future PIAAC cycles.